Search CORE

16 research outputs found

Supertree construction by matrix representation with flip

Author: Chen Duhong
Publication venue: Iowa State University Digital Repository
Publication date: 01/01/2003
Field of study

Digital Repository @ Iowa State University (ISU)

Improved Heuristics for Minimum-Flip Supertree Construction

Author: Burleigh J. Gordon
Chen Duhong
Eulenstein Oliver
Fernández-Baca David
Publication venue: Libertas Academica
Publication date: 01/01/2006
Field of study

The utility of the matrix representation with flipping (MRF) supertree method has been limited by the speed of its heuristic algorithms. We describe a new heuristic algorithm for MRF supertree construction that improves upon the speed of the previous heuristic by a factor of n (the number of taxa in the supertree). This new heuristic makes MRF tractable for large-scale supertree analyses and allows the first comparisons of MRF with other supertree methods using large empirical data sets. Analyses of three published supertree data sets with between 267 to 571 taxa indicate that MRF supertrees are equally or more similar to the input trees on average than matrix representation with parsimony (MRP) and modified min-cut supertrees. The results also show that large differences may exist between MRF and MRP supertrees and demonstrate that the MRF supertree method is a practical and potentially more accurate alternative to the nearly ubiquitous MRP supertree method

Directory of Open Access Journals

PubMed Central

PhyloFinder: An intelligent search engine for phylogenetic tree databases

Author: Bansal Mukul S
Burleigh J Gordon
Chen Duhong
Fernández-Baca David
Publication venue: BioMed Central
Publication date: 01/01/2008
Field of study

Abstract Background Bioinformatic tools are needed to store and access the rapidly growing phylogenetic data. These tools should enable users to identify existing phylogenetic trees containing a specified taxon or set of taxa and to compare a specified phylogenetic hypothesis to existing phylogenetic trees. Results PhyloFinder is an intelligent search engine for phylogenetic databases that we have implemented using trees from TreeBASE. It enables taxonomic queries, in which it identifies trees in the database containing the exact name of the query taxon and/or any synonymous taxon names, and it provides spelling suggestions for the query when there is no match. Additionally, PhyloFinder can identify trees containing descendants or direct ancestors of the query taxon. PhyloFinder also performs phylogenetic queries, in which it identifies trees that contain the query tree or topologies that are similar to the query tree. Conclusion PhyloFinder can enhance the utility of any tree database by providing tools for both taxonomic and phylogenetic queries as well as visualization tools that highlight the query results and provide links to NCBI and TBMap. An implementation of PhyloFinder using trees from TreeBASE is available from the web client application found in the availability and requirements section.</p

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Algorithms in supertree inference and phylogenetic data mining

Author: Chen Duhong
Publication venue
Publication date: 01/01/2007
Field of study

Science and society would benefit enormously from comprehensive phylogenetic knowledge of the Tree of Life (ToL), a framework that includes above 1.7 million species on Earth. ToL shows how living things have evolved since the origins of life billions of years ago. With existing computational approaches, we cannot produce a global estimate of evolutionary history from molecular data of all these species. Therefore, the development of computer algorithms and methods is critical to the field of phyloinformatics. The goal of this dissertation is to design, analyze, and implement algorithms to solve three specific problems in the field of phyloinformatics to support ToL studies.;The first part of this dissertation is about algorithms for the matrix representation with flipping (MRF) method to construct large subtrees of the ToL. The utility of the MRF supertree method has been limited by the speed of its heuristic algorithms. We describe a new heuristic algorithm for MRF supertree construction that improves upon the speed of the previous heuristic by a factor of n (the number of taxa in the supertree). This new heuristic makes MRF tractable for large-scale supertree analyses and allows the first comparisons of MRF with other supertree methods using large empirical data sets. Analyses of three published supertree data sets indicate that MRF supertrees are equally or more similar to the input trees on average than matrix representation with parsimony (MRP) and modified mincut supertrees. The results also show that large differences may exist between MRF and MRP supertrees, and demonstrate that the MRF supertree method is a practical and potentially more accurate alternative to the nearly ubiquitous MRP supertree method.;The second part of this dissertation is dedicated to new algorithms for partitioning phylogenetic data sets. We describe two new methods to partition phylogenetic data sets of discrete characters based on pairwise compatibility. The partitioning methods make no assumptions regarding the phylogeny, model of evolution, or characteristics of the data. The methods first build a compatibility graph, in which each node represents a character in the data set. Edges in the compatibility graph may represent strict compatibility of characters or they may be weighted based on a fractional compatibility scoring procedure that measures how close the characters are to being compatible. Given the desired number of partitions, the partitioning methods then seek to cluster together the characters with the highest average pairwise compatibility, so that characters in each cluster are more compatible with each other than they are with characters in the other cluster(s). We demonstrate that the spectral partitioning effectively identifies characters with different evolutionary histories in simulated data sets, and it is better at highlighting phylogenetic conflict within empirical data sets than previously used partitioning methods.;The third part of this dissertation describes a new intelligent search engine, named PhyloFinder, for phylogenetic databases that we have implemented using trees from TreeBASE. PhyloFinder enables taxonomic queries, in which it identifies trees in the database containing the exact name of the query taxon and/or any synonymous taxon names and provides spelling suggestions for the query when there is no match. Additionally, PhyloFinder can identify trees containing descendants or direct ancestors the query taxon by making use of the ontology guide from NCBI taxonomy database. PhyloFinder also executes phylogenetic queries, in which it identifies trees that contain the query tree or topologies that are similar to the query tree. PhyloFinder can enhance the utility of any tree database by providing tools for both taxonomic and phylogenetic queries as well as visualization tools that highlight the query results and provide links to NCBI and TBMap.</p

Digital Repository @ Iowa State University (ISU)

Supertree construction by matrix representation with flip

Author: Chen Duhong
Publication venue
Publication date: 01/01/2003
Field of study

Digital Repository @ Iowa State University (ISU)

Supertrees by Flipping

Author: Chen Duhong
Eulenstein Oliver
Fernández-Baca David
Sanderson Michael
Publication venue
Publication date: 01/10/2001
Field of study

There are now many thousand evolutionary trees, each of which constitutes a small subtree of the tree of life. We present an approach that allows us to integrate subtrees into larger trees in an effort to assemble the supertree of life. Topological inconsistency due to errors in phylogenetic estimation complicate the task of making supertrees. To tackle this problem we solve the following optimization problem: Given a set of trees that intersect pairwise in their leaf sets --- a grove --- find a tree --- a supertree --- that has the smallest distance to the trees in the grove. We introduce a new distance measure, called flip-distance, study the complexity of computing it, and show its performance using simulations.</p

Digital Repository @ Iowa State University (ISU)